Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit
Although speech recognition algorithms have developed quickly in recent years, achieving high transcription accuracy across diverse audio formats and acoustic environments remains a major challenge. This work explores how incorporating custom language models with the open-source Vosk Toolkit can improve speech-to-text accuracy in varied settings. Unlike many conventional systems limited to specific audio types, this approach supports multiple audio formats such as WAV, MP3, FLAC, and OGG by using Python modules for preprocessing and format conversion. A Python-based transcription pipeline was developed to process input audio, perform speech recognition using Vosk's KaldiRecognizer, and export the output to a DOCX file. Results showed that custom models reduced word error rates, especially in domain-specific scenarios involving technical terminology, varied accents, or background noise. This work presents a cost-effective, offline solution for high-accuracy transcription and opens up future opportunities for automation and real-time applications.
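The pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual code: it assumes a Vosk model unpacked at `model/`, a 16 kHz mono WAV input, and the third-party packages `vosk` and `python-docx` (all file paths here are hypothetical; format conversion for MP3/FLAC/OGG is noted in a comment rather than implemented).

```python
import json
import wave

def chunks(wav, frames=4000):
    """Yield fixed-size frame blocks from an open wave.Wave_read object."""
    while True:
        data = wav.readframes(frames)
        if not data:
            break
        yield data

def transcribe_to_docx(audio_path, model_dir, out_path):
    """Run Vosk over a 16 kHz mono WAV and save the transcript to DOCX.

    Third-party deps (assumed installed): vosk, python-docx. Non-WAV inputs
    (MP3/FLAC/OGG) would first be converted, e.g. with pydub:
        AudioSegment.from_file(p).set_channels(1).set_frame_rate(16000)
    """
    from vosk import Model, KaldiRecognizer  # third-party
    from docx import Document                # third-party (python-docx)

    wav = wave.open(audio_path, "rb")
    rec = KaldiRecognizer(Model(model_dir), wav.getframerate())
    pieces = []
    for block in chunks(wav):
        if rec.AcceptWaveform(block):
            pieces.append(json.loads(rec.Result()).get("text", ""))
    pieces.append(json.loads(rec.FinalResult()).get("text", ""))

    doc = Document()
    doc.add_paragraph(" ".join(p for p in pieces if p))
    doc.save(out_path)

if __name__ == "__main__":
    transcribe_to_docx("input.wav", "model", "transcript.docx")
```

Feeding the recognizer in fixed-size chunks rather than all at once keeps memory flat for long recordings.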
Create video subtitles with Amazon Transcribe using this no-code workflow
Creating subtitles for video content poses challenges for organizations large and small. To address those challenges, Amazon Transcribe offers a feature that enables subtitle creation directly within the service, with no machine learning (ML) expertise or code required to get started. This post walks you through setting up a no-code workflow for creating video subtitles with Amazon Transcribe in your Amazon Web Services account. The terms subtitles and closed captions are commonly used interchangeably, and both refer to spoken text displayed on the screen.
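The post above uses the console, but the same subtitle feature is also exposed through the StartTranscriptionJob API. As a hedged sketch (bucket, job name, and media path below are invented; submitting the job requires AWS credentials and the third-party `boto3` package):

```python
def build_subtitle_job(job_name, media_uri, output_bucket):
    """Build StartTranscriptionJob parameters that also request subtitles.

    The Subtitles block asks Amazon Transcribe to emit SRT/VTT files
    alongside the JSON transcript. Names and URIs here are hypothetical.
    """
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "IdentifyLanguage": True,
        "OutputBucketName": output_bucket,
        "Subtitles": {"Formats": ["srt", "vtt"]},
    }

def start_subtitle_job(params):
    """Submit the job (needs AWS credentials and boto3 installed)."""
    import boto3  # third-party
    return boto3.client("transcribe").start_transcription_job(**params)

if __name__ == "__main__":
    start_subtitle_job(
        build_subtitle_job("demo-job", "s3://my-bucket/video.mp4", "my-bucket")
    )
```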
Amazon Transcribe: Custom Language Model or General model?
If you use Amazon Transcribe for automatic speech recognition (ASR) in your project (especially for English), you have had to decide whether to build a custom language model or rely on the general model that the service provides. You may even have tried both options in your application. Having tried both in my own project, I'm going to share my two cents here. Suppose you used the general model to transcribe your audio or video files, and you noticed that Amazon Transcribe fails to recognize certain not-so-frequent English words or phrases spoken in the recordings.
Why Custom Language Models (CLMs) are Needed in Speech Recognition for Kids
Welcome back to "Lessons from Our Voice Engine," where members of our Engineering and Speech Tech teams offer high-level insights into how our voice engine works. Lesson 2 is from Lora Lynn Asvos, a Computational Linguist on our Speech Tech team. CLM stands for "custom language model." As mentioned in Lesson 1, language models are statistical models of language that can predict the next word based on the context. CLMs are language models, as the name implies, but they have a little something extra.
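The "predict the next word from context" idea can be illustrated with a toy bigram model. This sketch is for illustration only (the corpus below is invented) and is far simpler than any production language model:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count word -> next-word transitions across a list of sentences."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, word):
    """Return the word most frequently observed after `word`, if any."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# Invented toy corpus; a real model is trained on far more text.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
```

A CLM extends this idea by training on domain-specific text, so the counts (and hence the predictions) reflect the target vocabulary.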
Building custom language models to supercharge speech-to-text performance for Amazon Transcribe
Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that makes it easy to add speech-to-text capabilities to voice-enabled applications. As our service grows, so does the diversity of our customer base, which now spans domains such as insurance, finance, law, real estate, media, hospitality, and more. Naturally, customers in different market segments have asked Amazon Transcribe for more customization options to further enhance transcription performance. We're excited to introduce Custom Language Models (CLM). The new feature allows you to submit a corpus of text data to train custom language models that target domain-specific use cases. Using CLM is easy because it capitalizes on existing data that you already possess (such as marketing assets, website content, and training manuals). In this post, we show you how to best use your available data to train a custom language model tailored for your speech-to-text use case. Although our walkthrough uses a transcription example from the video gaming industry, you can use CLM to enhance custom speech recognition for any domain of your choosing. This post assumes that you're already familiar with how to use Amazon Transcribe, and focuses on demonstrating how to use the new CLM feature.
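Submitting a training corpus, as described above, maps to the CreateLanguageModel API. A hedged sketch (the model name, S3 URI, and IAM role ARN are hypothetical, and running the call requires AWS credentials plus the third-party `boto3` package):

```python
def build_clm_request(model_name, training_s3_uri, role_arn):
    """Parameters for Amazon Transcribe's CreateLanguageModel API.

    BaseModelName is 'WideBand' for audio sampled at 16 kHz or higher
    and 'NarrowBand' for telephony-rate audio. The S3 URI and IAM role
    ARN supplied here are hypothetical placeholders.
    """
    return {
        "LanguageCode": "en-US",
        "BaseModelName": "WideBand",
        "ModelName": model_name,
        "InputDataConfig": {
            "S3Uri": training_s3_uri,
            "DataAccessRoleArn": role_arn,
        },
    }

def create_clm(params):
    """Kick off CLM training (needs AWS credentials and boto3)."""
    import boto3  # third-party
    return boto3.client("transcribe").create_language_model(**params)

if __name__ == "__main__":
    create_clm(build_clm_request(
        "gaming-clm",
        "s3://my-bucket/clm-training-text/",
        "arn:aws:iam::123456789012:role/TranscribeCLMRole",
    ))
```

The S3 prefix would hold the plain-text corpus (marketing assets, website content, training manuals) mentioned in the post.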
Automating project management with deep learning – Towards Data Science
In the data-driven future of project management, project managers will be augmented by artificial intelligence that can highlight project risks, determine the optimal allocation of resources and automate project management tasks. For example, many organisations require project managers to provide regular project status updates as part of the delivery assurance process. These updates typically consist of text commentary and an associated red-amber-green (RAG) status, where red indicates a failing project, amber an at-risk project and green an on-track project. Wouldn't it be great if we could automate this process, making it more consistent and objective? In this post I will describe how we can achieve exactly that by applying natural language processing (NLP) to automatically classify text commentary as either red, amber or green status.
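As a minimal illustration of the classification step, here is a keyword-scoring baseline in plain Python. The cue lists and example commentary are invented, and this is only a stand-in for the trained NLP classifier the post describes:

```python
# Invented keyword cues per RAG status; a placeholder for a trained model.
CUES = {
    "red": ["blocked", "slipped", "overrun", "escalation", "critical"],
    "amber": ["risk", "delay", "dependency", "watch", "mitigation"],
    "green": ["on track", "on schedule", "completed", "within budget"],
}

def classify_rag(commentary):
    """Score commentary against each status's cues; default to green."""
    text = commentary.lower()
    scores = {
        status: sum(text.count(cue) for cue in cues)
        for status, cues in CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "green"
```

A baseline like this gives a yardstick: a real NLP model has to beat it to justify the added complexity.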